(54) Method and apparatus for scanning a document



(57) A document image capture method and scan- 
ner, and an image processing apparatus incorporating 
such a scanner, in which a document is scanned two or 
more times. The first scan preferably provides bi-level 
image data, which is analyzed to identify blocks of uni- 
form image type (for example, text, line drawing, gray- 
scale image, or full-color image) within the document. 
The second scan, preferably performed at lower reso- 
lution than the first, provides grayscale or color informa- 
tion, which is substituted in the grayscale or color blocks, 
respectively, for the bi-level information obtained in the 
first scan. A third scan, to provide information of the third 
type, may also be performed. An operator preferably
views an image of the document, based on the scanned
information, to be sure that the identification and typing 
of the various blocks has been done correctly, and may 
instruct that the document be rescanned to provide new 
data for a designated portion of the document image, if 
it appears that an error has occurred. The information 
representing the document image obtained in this way 
is preferably stored using a set of linked bit maps: one
bit map for each block. The memory capacity needed to 
store the information can be reduced further by treating 
the page and its margins as a frame, and by storing in- 
formation about the frame, and any horizontal or vertical 
lines in the document, in simple vector form. Any portion 
of the document which is just background is not stored. 
Description

The present invention relates to document image 
acquisition, and particularly to ensuring that the ac- 
quired image data will be of high quality and a resolution 
suitable for the content of the image, even if the image 
contains text together with halftone (grayscale levels) or 
color image, or both. 

As increasingly larger storage devices have become
available, it has become possible to store a document
not simply as ASCII text but also as a full facsimile
image of the document. More specifically, it is now
commonplace to convert a document into a
computer-readable bit map image of the document and to
store the bit map image. Accordingly, whereas ASCII text storage
permitted storage and display of only text portions of 
documents, it is now possible to store a document in 
computer readable form and to display not only the text 
but also pictures, line art, graphs, tables and other non- 
text objects in the document, as well as to show the text 
in the actual font and style used in the original document. 
Likewise, it is possible to store and display documents 
such that text attributes, such as size, position, etc., are 
preserved. 

Figure 3 shows a page of a representative document.
In Figure 3, a document page 40 is arranged in a
two-column format. The page includes title blocks 41, 
42, 47 which include text information of large font size 
suitable for titles, text blocks 43, 44, 48, which include 
lines of text data, graphics blocks 45, 46 which include 
graphic images which are not text (in this example, they 
are a line drawing and a full-color image), a table block 
49 which includes a table of text or numerical informa- 
tion, and a caption block 50 which includes small text 
data and which is a caption associated with blocks of 
graphic or tabular information. 

Despite the technical advances mentioned above, 
however, it is still difficult to store document images in 
computer memory efficiently, because of the large 
amount of information required for even one page. For 
example, at 300 dots-per-inch resolution, an ordinary
8½ by 11 inch black and white document requires
approximately 8.4 million bits to store a full document
image (assuming that only one bit is used per dot, which
is possible with monochrome text and line drawings, but 
not with images containing grayscale image or color im- 
age portions). Adding grayscale image or color to the 
document, or increasing the resolution at which the im- 
age is stored, can easily increase storage requirements 
to many tens of millions of bits per page. Moreover, the 
time required to retrieve those bits from storage and to 
create and display the resulting image is significant, 
even with current high speed computing equipment. The 
time is lengthened even further in situations where the 
document image is retrieved from storage in a first com- 
puter and electronically transmitted, by modem, for ex- 
ample, to a second computer for display on the second 
computer. 
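
The storage figure quoted above can be checked with a short calculation. The following Python sketch is illustrative only; the function name `page_bits` and the 8-bit and 24-bit pixel depths are assumptions chosen for the example, not values taken from this description.

```python
def page_bits(width_in, height_in, dpi, bits_per_pixel):
    """Bits needed for one uncompressed page raster (illustrative helper)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel


# Letter-size page at 300 dots per inch.
bilevel = page_bits(8.5, 11, 300, 1)   # one bit per dot
gray8 = page_bits(8.5, 11, 300, 8)     # assumed 8-bit grayscale
rgb24 = page_bits(8.5, 11, 300, 24)    # assumed 8 bits per color component

print(bilevel)  # 8415000, i.e. about 8.4 million bits
print(gray8)    # 67320000
print(rgb24)    # 201960000
```

Under these assumed depths, grayscale or color storage of the whole page lands in the tens to hundreds of millions of bits, which is the growth the passage describes.
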
It has been conventional to scan a document combining
black and white text with color image or grayscale
image, or both, in a PC-based document management
system using only a black and white (bi-level) scanner.
Many disadvantages are attendant upon this approach,
however.

First, scanning a color or grayscale image in black
and white scanning mode not only loses all the hue
information of a color original and the gradations in density
of both color and grayscale images, but in many cases
results in a mere conglomeration of black blobs. Text
and line drawings scanned in a grayscale or color mode,
on the other hand, become very blurry, and characters
scanned in that fashion are not legible to optical
character recognition processing ("OCR processing").

Moreover, even color scanning a grayscale image
often produces unacceptable results. Although a color
scanner should pick up the densities in a grayscale
image well, inadequacies in the scanner may result in
some "tint tainting" of the grayscale image data. That is,
although the grayscale image is made up entirely of
black, white and shades of gray and so has no
chrominance or hue, the scanner may erroneously detect a
slight hue in the grayscale image. This is because the
color scanner cannot directly detect a gray value as
such, but can only detect three predetermined primary
colors, typically red, green and blue. When scanning an
achromatic point, such as a point that is pure black,
white or gray, the color scanner should detect exactly
equal values for these three color components. In
practice, however, slightly different values for the three color
components may be detected, due to scanner
inadequacies. Upon display or reproduction, the point will
have a slight hue instead of being achromatic as it
should be.
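
The "tint tainting" effect can be made concrete with a small Python sketch. It is a minimal illustration, assuming 8-bit color components; the helper names and the spread measure (largest component minus smallest) are hypothetical and are not taken from this description.

```python
def channel_spread(r, g, b):
    """Difference between the largest and smallest color component.

    An ideal achromatic reading has r == g == b, so the spread is 0.
    """
    return max(r, g, b) - min(r, g, b)


def is_tinted(r, g, b, tolerance=0):
    """True if the reading departs from gray by more than `tolerance`
    (an assumed, adjustable threshold)."""
    return channel_spread(r, g, b) > tolerance


ideal_gray = (128, 128, 128)     # what a perfect scanner would report
scanned_gray = (131, 128, 126)   # assumed example of a slightly tinted reading

print(channel_spread(*ideal_gray))    # 0
print(channel_spread(*scanned_gray))  # 5
```
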

Thus, using one type of scanning for an entire doc- 
ument that includes color, grayscale or both, in addition 
to text, is not a viable approach. 

Also, with document images (as opposed to text
documents created locally in ASCII code using a
word-processing program to begin with), it has been proposed
to subject text portions of a document image to optical
character recognition processing and to store the
character information so obtained in ASCII form, greatly
reducing the amount of storage required for the text
portions. This technique, however, does not preserve any
information regarding the type font used in the original
document, and obviously is not applicable to non-textual
portions of a document, or even to textual portions which
are not in a font recognizable by the particular OCR
process being employed.

The growing importance of desktop publishing in
the business world only makes the problems described
above more urgent. This technique has come to depend
more and more heavily on scanning as a way of capturing
material, that is, of entering text, color images and
grayscale images in a form usable in a desktop publishing
system.
It is one object of the present invention to provide
an apparatus and method for processing a document so 
as to capture or acquire the contents of the document 
and to store those contents for future retrieval, with re- 
duced memory capacity requirements. 

It is a separate object of the invention to provide an 
apparatus and method for processing a document to 
capture and store the document contents in such a man- 
ner as will permit convenient and quick retrieval of the 
document for display or other processing at a later time. 

It is a separate object of the invention to provide an 
apparatus and method for processing a document to 
capture the document contents in such a manner that 
text, line drawing, grayscale and color portions are each 
treated in a way suitable for each of these image types, 
and such as to prevent degradation in image quality re- 
sulting from the processing and storage of the informa- 
tion. 

It is a separate object of the invention to provide a 
single-board document scanner which meets the fore- 
going objects, and a document image management sys- 
tem using such a scanner. 

It is a separate object of the invention to provide a 
method and apparatus, and in particular a scanner, 
which meet the foregoing objects and are suitable for
use in connection with, or as part of, a desktop publish- 
ing system. 

In a first aspect, the invention provides an image 
scanning method and apparatus, which may be either 
an individual scanner by itself or a more elaborate ap- 
paratus or document image management system in- 
cluding the scanner, using first and second sensors, and 
a control system, and in which the control system effects 
a first scan of an image, using the first sensor, and then
a second scan, using the second sensor. 

In another aspect, the invention provides an image 
scanning method and apparatus, which may be either 
an individual scanner by itself or a more elaborate ap- 
paratus or document image management system in- 
cluding the scanner, using a sensor system, which may 
be either one or plural sensors, and a control system, 
and in which the control system effects plural successive
scans of an image, to provide successively a combination
of bi-level, grayscale and color data as needed.

In another aspect, the invention provides a scanning
method and scanner or larger apparatus including
such scanner, using first and second sensors, a detector
which detects image type based on the image data itself,
and a control system. In this aspect of the invention, the
control system causes a first scan of the image to be
carried out using the first sensor, and then a second
scan, responsive to detection that image content of a
particular type is present in the image. The second scan
is carried out using the second sensor.

In another aspect of the invention, there is provided
a method and a scanner and an apparatus or system
incorporating the scanner, using first and second
sensors, a memory and an analysis and control system. In
this aspect of the invention, the analysis and control
system itself detects image type based on image data
obtained using the first sensor.

Upon detection of image content of a particular type
in at least one portion of the document, the image is
scanned using the second sensor. Additionally, the
information obtained from the first scan is stored in the
memory, after which information from the second scan
is stored in the memory, only for those portions of the
image identified as being of the particular image type.

According to another aspect of the invention, there
is provided a method and scanner and a system and
apparatus incorporating such scanner, using first and
second sensors, a display and a control system, in
which information obtained by scanning the image using
the first sensor is displayed, after which a second scan
is performed using the second sensor, responsive to
entry of an instruction by an operator for such second scan.

In another aspect of the invention, there is provided
a method and a scanner and an apparatus and system
incorporating the scanner, using first and second
sensors, and an analysis and control system in which image
information obtained from a first scan of the document
using the first sensor is analyzed to identify portions of
the image as having various image types. Also, according
to this aspect of the invention, a determination is
made that image content of first and second types is
present in at least first and second respective portions
of the document, and a second scan is performed, in
which the second sensor is used. In addition, in this
aspect of the invention, the information obtained in the first
scan is initially displayed, and after the second scan,
information from that scan is used in the display, but only
for those portions of the image identified as being of the
second image type.

According to still another aspect of the invention,
there is provided a method and a scanner and an
apparatus or system incorporating the scanner, using first and
second sensors, a memory and an analysis and control
system, in which data obtained by scanning the image
with the first sensor is used to identify portions of the
image as being of various image types. A second scan
is performed, using the second sensor, responsive to a
determination that image content of a particular type is
present in at least one portion of the document.
Moreover, image data obtained by the first sensor is stored in
the memory initially, and thereafter information obtained
by the second sensor is stored in the memory, only for
those portions of the image identified as being of the
particular image type. According to this aspect of the
invention, the image data is stored in the memory in the form
of respective bit maps for respective portions of the
image, and those bit maps are linked in the memory.

According to still another aspect of the invention,
there is provided a method and a scanner, and an
apparatus and system including the scanner, using a
color sensor, a memory and an analysis and control
system, in which a scan of the image is performed using
the color sensor, after which, responsive to detection
that the image contains grayscale image in at least one
portion, the color image data obtained for that portion is
converted to grayscale data. Also, according to this
aspect of the invention, information obtained by the color
sensor is stored in the memory for non-grayscale
portions of the image while the grayscale image data is
stored for those portions identified as being grayscale
image.
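
One possible form of the color-to-grayscale conversion in this aspect is sketched below in Python. The weighting is an assumption: it uses the widely known luma coefficients 0.299, 0.587 and 0.114, and it is not stated here that the invention uses them. The function names are likewise hypothetical.

```python
def to_gray(r, g, b):
    """Weighted average of 8-bit color components, rounded to an integer.

    The weights are the common luma coefficients (an assumption made
    for illustration, not the conversion defined by this description).
    """
    return (299 * r + 587 * g + 114 * b + 500) // 1000


def convert_block(pixels):
    """Convert a block given as rows of (r, g, b) tuples to gray values."""
    return [[to_gray(*p) for p in row] for row in pixels]


block = [[(0, 0, 0), (131, 128, 126), (255, 255, 255)]]
print(convert_block(block))  # [[0, 129, 255]]
```

Because each pixel collapses to a single value, any small imbalance between the three components disappears from the stored data, and the block needs one third of the bits of its color representation.
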

According to another aspect of the invention, there 
is provided a method and a scanner and apparatus or 
system incorporating the scanner, in which portions of 
a document are identified as being of respective image 
types, and image data representing the document is 
stored in a memory, and in which the image data is or- 
ganized in a set of linked bit maps each containing in- 
formation of only one image type and pertaining to only 
one of the identified portions of the document. 

These and other objects, features and advantages 
of the invention will be more fully understood from the 
following detailed description of the preferred embodi- 
ments, taken in conjunction with the accompanying 
drawings. In the drawings, it is to be understood that like 
elements are indicated by like reference characters. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a perspective view of an apparatus, in- 
corporating a scanner, according to the present inven- 
tion. 

Figure 2 is a block diagram showing schematically 
the construction of the apparatus of Figure 1. 

Figure 3 is an example of a page containing a mix- 
ture of text and color image. 

Figure 4 is a flow chart illustrating the overall operation
of the system of Figure 1 to scan a document.

Figure 5 is a schematic view of the scanner shown 
in Figure 1. 

Figure 6 is a block diagram showing schematically 
the construction of the scanner of Figure 5. 

Figures 7A through 7C are flow charts illustrating in 
more detail the operation of the scanner of Figure 5 to 
carry out the process of Figure 4. 

Figure 8 is a view of the page shown in Figure 3, as 
analyzed during scanning by the scanner of Figure 5. 

Figure 9 is a flow chart illustrating the conversion of 
color image information into grayscale information in the 
first embodiment. 

Figure 10 is an illustration of block information de- 
rived from the scanning of the page shown in Figure 3. 

Figure 11 is a schematic view of a second embodiment
of a scanner according to the invention.

Figure 12 is a block diagram illustrating the con- 
struction of the scanner of Figure 11. 

Figure 13 is a flow chart illustrating the operation of
the scanner of Figure 11.

Figure 14 is a schematic view of a third embodiment
of a scanner according to the invention. 



Figure 15 is a block diagram illustrating the con- 
struction of the scanner of Figure 14. 

Figure 16 is a schematic view of a fourth embodiment
of a scanner according to the invention.

Figure 17 is a block diagram illustrating the construction
of the scanner of Figure 16.

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

Figures 1 and 2 show an apparatus according to the 
present invention, specifically, a document image man- 
agement system. 

As shown in these figures, reference numeral 10
designates personal computing equipment such as an
IBM PC or PC-compatible computer. Computing equipment
10 includes a CPU 11 such as an 80386 or 80486
processor (or any other sufficiently powerful processor)
which executes stored program instructions such as
operator selected applications programs that are stored in
RAM 14 or specialized functions such as start-up programs
or BIOS which are stored in ROM 12. Computing
equipment 10 further includes a local area network
interface 15 which provides interface to a local area
network 16 whereby the computing equipment 10 can
access files such as document files on a remote file server,
send files for remote printing, have remote machines
access document images on equipment 10, or otherwise
interact with a local area network in accordance with
known techniques such as by file exchange or by
sending or receiving electronic mail.

Computing equipment 10 further includes a monitor
17 for displaying graphic images and a keyboard/mouse
19 for allowing operator designation of areas on monitor
17 and inputting information.

Mass storage memory 20, such as a fixed disk or a
floppy disk drive, is connected for access by CPU 11.
Mass storage 20 typically includes stored program
instruction sequences such as an instruction sequence for
indexing, retrieving and displaying documents, as well
as other stored program instruction sequences for
executing application programs such as word processing
application programs, optical character recognition
programs, block selection applications programs,
spreadsheet application programs, and other information and
data processing programs. Mass storage memory 20 as
shown further includes document index tables which
contain index information by which documents can be
retrieved, as well as bit map images of documents,
document structures, and ASCII text for text areas of the
documents. Other data may be stored in mass storage
memory 20 as desired by the operator.

A modem 21, a facsimile interface 22, and a voice
telephone interface 24 are provided so that CPU 11 can
interface to an ordinary telephone line 25. The modem
21, facsimile interface 22, and voice telephone interface
24 are each given access to the telephone line 25 via a
telephone line switch 26 which is activated under control
by CPU 11 so as to connect telephone line 25 to the
modem 21, the facsimile interface 22, or the voice telephone
interface 24, as appropriate to the data being sent and
received on the telephone line. Thus, CPU 11 can send
and receive binary data such as ASCII text files or
document image files via modem 21. The CPU 11 can be
controlled by a remote computer via modem 21, it can
send and receive facsimile messages via facsimile
interface 22, and it can interact on an ordinary voice
telephone line via voice telephone interface 24. In this
regard, voice telephone interface 24 is provided with a
DTMF decoder 24A so as to decode tones on the voice
telephone line 25 which correspond to operator
depressions of keys on a telephone keypad. In accordance with
stored program instruction sequences in mass storage
memory 20, the decoded tones are interpreted by CPU
11 into operator commands, and those operator
commands are executed so that predesignated actions are
taken in accordance with operator depressions of the
telephone keypad keys.

A conventional text-to-speech converter 27 is
connected to the CPU 11. The text-to-speech converter 27
interprets text strings that are sent to it and converts
those text strings to audio speech information. The
text-to-speech converter 27 provides audio speech
information either to speakers 28 for enunciation to a local
computer operator, or to the voice telephone interface 24
for enunciation over ordinary voice telephone lines.

MIDI ("Musical Instrument Digital Interface")
synthesizer 30 is also connected to CPU 11 and interprets
MIDI music commands from CPU 11 so as to convert 
those MIDI music commands to audio wave forms. The 
audio wave forms are, in turn, played out over speakers 
28 or provided to voice telephone interface 24 for play 
out over ordinary voice telephone lines. 

Scanner 31 operates to scan original documents 
printed on paper sheets or other recording media, and 
to convert the information contained in those original 
documents into a bit-by-bit computer readable repre- 
sentation of each such document. Scanner 31 has black 
and white (bi-level) scanning capability, but also in- 
cludes grayscale processing capabilities or color 
processing capabilities, or both, as described below. 

Printer 32 is provided to form images of documents 
under the control of CPU 11. Printer 32 may be an
ordinary black and white printer, but, more preferably, printer
32 includes color and/or grayscale printing capabilities. 

A CD-ROM 34, such as an optical disk, is connected
for access by CPU 11. The CD-ROM 34 operates to
supplement the storage in mass storage memory 20 and
contains additional information concerning document
images, document indexes and document structure. It
is also possible to provide a write-once-read-many
("WORM") optical device or an ordinary read/write
optical device so as to further supplement the storage
capabilities of the apparatus. In addition, via the local area
network 16, CPU 11 can access document indexes,
document images and document structure stored at remote
file server locations, and via modem 21, CPU 11 can
access document indexes and document images stored
at centralized data base locations over ordinary voice
telephone lines.

Figure 3, mentioned above, is an illustration of what
a typical page of an input document might look like. As
shown in Figure 3, it is common for such a page to
include text portions (not necessarily all in the same style
of font or of the same size of type), as well as graphs or
other line drawings, grayscale images (i.e., black and
white or other monochrome images in which gradations
between pure black and pure white are expressed), and
full-color images. In the example shown
in Figure 3, several areas of text are present, including
different fonts and print sizes, as well as a color
photograph (indicated in the illustration in black and white)
and a line drawing (in this instance, a line graph).

The operator whose task it is to input the document
of Figure 3 into a document image management system
database using the equipment shown in Figure 1
performs this job as follows. First, the document is placed
on the scanner 31, and is scanned by that unit. The
information obtained in this manner is displayed on the
monitor 17 for review by the operator, and, if dissatisfied
with the manner in which the system has input the
information, the operator can designate particular areas of
the document to be reprocessed. When the operator is
satisfied with the information acquired in this manner,
the information is stored, for example, in the mass
storage memory 20.

This basic process is illustrated in the flow chart of
Figure 4. In step S1, the operator places the document
in the scanner 31 and instructs the scanner to
commence operation. This may be done by pressing a
button on the scanner itself, or by entering a command via
the mouse or the keyboard. The scanner reads the
entire surface of the document with light from a light
source, either by scanning the surface with the light
beam or by moving the document past a stationary
reading position illuminated by the light source. The light
reflected from the document varies in intensity in
accordance with the image content of the particular point on
the image from which it has been reflected. In a black
and white (bi-level) portion of a document, for instance,
the reflected light will have one of two intensities,
depending on whether the particular point on the document
is black or white. When scanned on a grayscale portion,
the reflected light beam intensity will vary between those
two extreme values, according to the density of the
scanned point. In a color portion, a white light beam from
the light source will contain three primary-color
components (e.g., red, green and blue), the intensity of each
of which will vary depending on the density of the
corresponding primary-color component at the point in
question on the document. Thus, the reflected light
beam conveys information, in the form of intensity
variations, from which the scanner determines document
image information, pixel by pixel.

This information is output from the scanner 31 in the
form of a digital signal representing information as
bi-level information for each pixel. That is, for each pixel,
a single bit indicates whether the scanner has evaluated
the pixel as black or as white. The scanner 31 converts
this information, which is gathered at relatively high
density (for example, 200 dots per inch) into information of
a resolution suitable for display on the monitor 17 (for
example, 60 or 80 dots per inch). This information is then
displayed on the monitor 17, preferably at such a size
that half of a page or more is visible on the monitor at
once.
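
The resolution conversion for display can be illustrated as follows. This Python sketch uses nearest-neighbour sampling, which is an assumption made for simplicity; the description does not say which reduction method the scanner applies, and the function name is hypothetical.

```python
def downsample_bilevel(bitmap, src_dpi, dst_dpi):
    """Reduce a bi-level bit map (rows of 0/1) from src_dpi to dst_dpi.

    Each output pixel copies the source pixel at the corresponding
    position (nearest-neighbour sampling, assumed for illustration).
    """
    src_h, src_w = len(bitmap), len(bitmap[0])
    dst_h = max(1, src_h * dst_dpi // src_dpi)
    dst_w = max(1, src_w * dst_dpi // src_dpi)
    return [
        [bitmap[y * src_dpi // dst_dpi][x * src_dpi // dst_dpi]
         for x in range(dst_w)]
        for y in range(dst_h)
    ]


# A 10 x 10 patch scanned at 200 dpi: left half white (0), right half black (1).
patch = [[1 if x >= 5 else 0 for x in range(10)] for _ in range(10)]
preview = downsample_bilevel(patch, 200, 80)
print(len(preview), len(preview[0]))  # 4 4
```
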

In step S2, the scanner 31 analyzes the bi-level
information to identify various blocks of features on the
document. For example, the algorithm used for this
purpose will identify the blocks indicated in Figure 3. The
scanner 31 also analyzes the contents of each block to
determine whether each block is text (which can suitably
be handled as bi-level information), like block 43, or
whether it is grayscale or color information (both
requiring several bits per pixel for proper representation), like
block 46. Once the location, size, shape and type of
each block have been determined in this manner, the
monitor 17 displays the information taken from the
document itself, and preferably also displays an indication
of the block boundaries and perhaps of the nature of
each block.

In the bi-level scanning step S1, the document is
first scanned to produce bi-level information, that is, one
bit for each pixel. For text and (black and white) line
graphics, this is an appropriate scanning method, and
the data obtained in this fashion will be suitable for
storage and subsequent processing. The grayscale and
color areas, however, cannot be properly represented
by bi-level information without great loss of image
content and quality. Therefore, after the initial scan to
produce bi-level information, the scanner 31 performs a
second scan, in step S3, to obtain color information from
the document. For each pixel, the information obtained
in this scan includes a multi-bit datum for each of three
primary colors, for example, 8 bits each for red, green
and blue color components. The color information
obtained in the color scan is substituted for the
corresponding bi-level information for those pixels lying in areas
identified as color or grayscale image, and this
substitution is displayed on the monitor 17 as well.
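
The substitution step can be sketched in Python as shown below. The sketch makes several simplifying assumptions that are not part of this description: blocks are axis-aligned rectangles, both scans are held at the same pixel grid, and the names `Block` and `substitute_blocks` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Rectangular block found by the analysis step (illustrative layout)."""
    top: int
    left: int
    height: int
    width: int
    kind: str  # assumed labels: "text", "line", "grayscale" or "color"


def substitute_blocks(bilevel, color, blocks):
    """Return a page in which color-scan pixels replace bi-level pixels
    inside every block typed as color or grayscale."""
    out = [list(row) for row in bilevel]  # copy so the first scan is kept
    for blk in blocks:
        if blk.kind not in ("color", "grayscale"):
            continue
        for y in range(blk.top, blk.top + blk.height):
            for x in range(blk.left, blk.left + blk.width):
                out[y][x] = color[y][x]
    return out


bilevel = [[0, 0, 0] for _ in range(3)]
color = [[(9, 9, 9)] * 3 for _ in range(3)]
blocks = [Block(0, 0, 1, 3, "text"), Block(1, 1, 2, 2, "color")]
page = substitute_blocks(bilevel, color, blocks)
print(page[1])  # [0, (9, 9, 9), (9, 9, 9)]
```

Text blocks keep their one-bit pixels, so only the regions that need multi-bit data pay for it.
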

The operator reviews what is displayed on the
monitor 17 as a result of these scans. If there are any
overlapping blocks resulting from the analysis algorithm, or
if any regions appear to have been misclassified, such
problematic areas can be designated by the operator,
in step S4, using mouse or keyboard controls 19, for
example, and the operator instructs that each designated
area be reprocessed, typically by rescanning (other
possibilities are explained below). In this case, the operator
also designates what type of scan should be performed
for the designated area (color or bi-level). The
information obtained from this new scan is substituted for that
previously present in the designated area.

Once the displayed area meets with the operator's 
approval, the operator reviews the rest of the page (if 
any) and, when that also is satisfactory, enters a "store" 
instruction. The information for the document is then 
sent by the scanner to the CPU 11 for storage, in mass 
storage memory 20 or in CD-ROM 34. 

Alternatively, the information can be sent to a re- 
mote location via the local area network 16, or via the 
fax interface 22 and telephone line switch 26. 

In the preferred embodiments, the information for 
the page is stored by means of respective bit maps for 
the different blocks, and these bit maps are linked to 
each other by an appropriate set of pointers, to form a 
single image file. This approach can be easily accom- 
modated using the TIFF standard. Other manners of 
storage, however, are also contemplated, and ordinary 
DIB storage, for example, may be used. 
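
The idea of per-block bit maps joined by pointers can be pictured with the following Python sketch. It is only a schematic in-memory model under assumed field names; it does not reproduce the TIFF or DIB file layouts, and the class and helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class BlockBitmap:
    """One bit map for one block, with a pointer to the next block."""
    kind: str             # assumed labels such as "text" or "color"
    left: int
    top: int
    width: int
    height: int
    bits_per_pixel: int
    data: bytes
    next: Optional["BlockBitmap"] = None


def link(blocks: List[BlockBitmap]) -> Optional[BlockBitmap]:
    """Chain the blocks in order and return the head of the chain."""
    for cur, nxt in zip(blocks, blocks[1:]):
        cur.next = nxt
    return blocks[0] if blocks else None


def walk(head: Optional[BlockBitmap]) -> Iterator[BlockBitmap]:
    """Follow the pointers from the head to the last block."""
    node = head
    while node is not None:
        yield node
        node = node.next


def stored_bits(head: Optional[BlockBitmap]) -> int:
    """Total raster bits over all linked blocks (pixel data only)."""
    return sum(b.width * b.height * b.bits_per_pixel for b in walk(head))


text = BlockBitmap("text", 0, 0, 100, 50, 1, b"")
photo = BlockBitmap("color", 0, 60, 40, 30, 24, b"")
page = link([text, photo])
print([b.kind for b in walk(page)])  # ['text', 'color']
print(stored_bits(page))             # 33800
```

Each block carries its own pixel depth, so a page image built this way stores one bit per pixel for text and multi-bit data only where a block requires it.
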

Figure 5 is a partial cross-sectional view of a scan- 
ner according to the first embodiment of the invention. 
As shown in this Figure, the scanner is provided with a 
transparent platen 51 on which the document 52 is 
placed face down. The platen 51 and document 52 are 
illuminated from below by a light source 53, which is typ- 
ically a halogen lamp, for example. The light source 53 
illuminates the full width of the document, preferably, 
and a scan of the entire document is performed by ef- 
fecting relative motion between the document 52 and 
the light source 53. While this can be done by moving 
the platen 51 with the document resting on it, past the 
light source 53, it is also possible to scan by holding the 
platen 51 stationary while the light source 53 is moved. 
In the latter approach, which is adopted in the embodi- 
ment of Figure 5, the light source 53 is moved parallel 
to the underside of the platen 51 at a speed v, and a first 
mirror 54 is traversed parallel to the light source 53, but 
at a speed v/2. The traversing mirror 54 is oriented to 
receive the light reflected from the portion of the docu- 
ment which is most brightly irradiated, and directs that 
light to a second mirror 55, which in turn reflects the light
to two further mirrors 56 and 57, one of which (mirror 
56) is movable, and which cause the light to impinge on 
one or the other of two parallel light sensors 58 and 59, 
respectively. 

In this embodiment, both light sensors 58 and 59 
are linear (one-dimensional) arrays of CCD elements, 
which are well known in the art. Depending on the ori- 
entation of the movable mirror 56, the light either pro- 
ceeds directly to the first CCD sensor 58, or, if mirror 56 
is retracted from the path of the light, to the second CCD 
sensor 59. The first sensor 58 is divided into relatively 
small pixels, to provide information having a high reso- 
lution, preferably at least 200 dots per inch. This sensor 
provides the bi-level information. The pixels of the sec- 
ond CCD sensor 59 are larger than those of the first, 
providing a lower-resolution output. In order to output 
color information, each pixel of the second sensor 59 is 
covered with a color filter (not shown) that is either red, 
green or blue. Light which reaches any of the pixels in 
this sensor does so only after passing through one of 
these color filters, and thus provides information relating 
to one of these three primary-color components. The fil- 
ters of these three colors are arranged in an alternating 
pattern, in a manner well known to those in the art, so 
that each group of three adjacent pixels of the second 
sensor 59 includes one pixel each to receive red, green 
and blue light. In this manner, by directing the light re- 
flected from the document to a particular one of the CCD 
sensors, the scanner obtains either bi-level or color im- 
age information. 

It will be appreciated that movable mirror 56 can be 
replaced with a half mirror or other similar beamsplitter, 
although the resulting arrangement has the disadvan- 
tage that each sensor will receive a less-intense irradi- 
ation than in the arrangement described above. 

In each sensor 58 and 59, the radiation impinging 
on each pixel causes the formation of charges in that 
pixel. After a predetermined length of time sufficient to 
accumulate a readable amount of charge, the accumu- 
lated charges are read out from the sensor. Preferably 
this is done by reading out the charges from all the pixels 
of the sensor in parallel, to an analog shift register (not 
shown), from which they are then shifted out in series. 
The resulting charges are read out as currents propor- 
tional to the amount of accumulated charge, which can 
be (and generally are) converted to a voltage signal by 
conventional circuitry. The voltage signal, which is still 
in analog form, is then converted to digital form. 

In this manner, the information from the bi-level 
CCD sensor 58 becomes a simple binary bit stream, 
with one bit of information for each pixel. The color data, 
in contrast, is digitized in such fashion as to produce 
several bits per pixel. Typically, eight bits will suffice for 
each color component for each pixel in the color infor- 
mation. 

A page memory 61, sufficient to hold the bi-level 
information for an entire page of predetermined size, is 
provided (see Figure 6), and stores the bi-level information 
for the entire document. In this embodiment, page 
memories 62R, 62G and 62B are also provided for the 
color data. It will be appreciated that the page memory 
for each color component of the color data is several 
times as large as that for the bi-level data, since several 
times as many bits per pixel are required. 
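
The relative sizes of the page memories can be checked with a short calculation. The sketch below assumes, purely for illustration, an 8.5 x 11 inch page; the page size and the helper name `page_memory_bytes` are not taken from the description.

```python
def page_memory_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Bytes needed to hold one page plane at the given resolution and depth."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8          # round up to whole bytes


# Assumed page size: 8.5 x 11 inches.
bilevel_200 = page_memory_bytes(8.5, 11, 200, 1)   # bi-level page memory 61
color_200 = page_memory_bytes(8.5, 11, 200, 8)     # one color component, same resolution
color_80 = page_memory_bytes(8.5, 11, 80, 8)       # one color component at 80 dpi
```

At equal resolution each eight-bit color component needs eight times the bi-level capacity (3,740,000 bytes against 467,500 bytes under these assumptions); scanning the color data at 80 dots per inch brings each component down to 598,400 bytes.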

Document Capture in the First Embodiment 

When the operator places a document 52 on the 
platen 51 and enters an instruction to commence scanning 
(this instruction may be entered either through the 
keyboard or mouse 19 shown in Figure 2, or directly by 
means of a button or the like provided for this purpose 
in the scanner 31), movable mirror 56 is positioned in 
such manner as to cause light reflected from the underside 
of the document to go to the first CCD sensor 58. 
After one line of data is read by the sensor 58, the line 
of data is read out from the sensor as described above 
and stored in the page memory 61 for bi-level data. The 
document is read in this way one scan line at a time, 
until the entire document has been scanned and the 
resulting bi-level data has been stored in the page memory 
61. This information is copied, in the illustrated 
embodiment, into a document image page memory 63, for a 
purpose described below. 

The scanner CPU 64 now processes this information 
to identify blocks of common image type in the page. 
That is, the page image is analyzed by the scanner to 
identify regions, preferably rectangular, containing all 
text, all full-color or grayscale image, etc. This analysis 
is carried out using an algorithm devised by one of the 
present inventors, and disclosed in detail in EP-A-0567344 
(US Serial No. 07/873,012), the disclosure of 
which is hereby incorporated herein by reference. Of 
course, any other algorithm which will perform the desired 
analysis may be used instead, but the mentioned 
one is the preferred manner for carrying out this part of 
the invention. The blocks which result are illustrated in 
Figure 8. 

In the preferred algorithm, briefly, blocks of pixel image 
data are identified, or selected, by checking each 
pixel to see which adjacent pixels have the same bi-level 
value (step S201 in Figure 7A). This indicates (usually 
small) regions each made up of connected black pixels. 
Then, contours of such connected components in the 
pixel image data are outlined (step S202). For each such 
group, a determination is made as to whether the outlined 
connected components include text or non-text 
units, based on the size of the outlined connected components 
(step S203). Text units are then connected selectively, 
widthwise, to form text lines based on proximity 
of adjacent text units, and the resulting text lines are 
then selectively connected vertically to form text blocks, 
also based on proximity of adjacent text lines and on the 
presence or absence of non-text units between text lines 
(step S204). 
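
Steps S201 to S203 can be pictured with the following simplified Python sketch. It is only a stand-in for the idea of grouping connected black pixels and typing them by size; it is not the algorithm of EP-A-0567344, and the 8-connectivity rule, the function names and the `max_text_height` threshold are assumptions made for illustration.

```python
from collections import deque


def connected_components(bitmap):
    """Return bounding boxes (top, left, bottom, right) of 8-connected black regions.

    `bitmap` is a list of rows, each a list of 0 (white) or 1 (black).
    """
    rows = len(bitmap)
    cols = len(bitmap[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] != 1 or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and bitmap[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify(box, max_text_height=3):
    """Type a component by size: small components are treated as text units."""
    top, _left, bottom, _right = box
    return "text" if (bottom - top + 1) <= max_text_height else "non-text"
```

In a fuller implementation the text units found in this way would then be merged into lines and blocks according to their proximity, as in step S204.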

Text blocks are segmented into text lines of pixel 
data by dividing the blocks into columns (although in 
some cases no such division is necessary), based on 
the horizontal projection of pixel density across the column, 
and characters are then cut from the segmented 
lines (step S205). The cut characters can then be subjected 
to OCR, and character codes for each can be derived 
based on such recognition processing. 
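
The use of a horizontal projection to separate text lines can be sketched as follows. This is a minimal illustration, assuming a text block given as rows of 0/1 pixels and treating any row with no black pixels as a gap between lines; the function name `segment_lines` is hypothetical.

```python
def segment_lines(block):
    """Split a text block into (first_row, last_row) pairs.

    The projection is the count of black pixels in each row; a run of
    rows with a non-zero count is taken to be one text line.
    """
    profile = [sum(row) for row in block]
    lines = []
    start = None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                      # a new text line begins
        elif count == 0 and start is not None:
            lines.append((start, y - 1))   # the current line has ended
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines
```

A real document would usually call for a small non-zero density threshold rather than an exact zero, to tolerate scanning noise.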

The monitor 17 then is caused to display an image 
of the page. The image data actually used to control the 
display is arranged in the video memory VMEM 65, 
which under the control of the scanner CPU 64 is supplied 
with information for display from the document image 
page memory 63. 

In practice, most monitors are not large enough to 
display an entire page at a legible resolution. Accordingly, 
it is contemplated that a portion of a page, preferably 
at least one-half of the page, will be displayed at 
one time. In any event, this display will be at a resolution 
considerably below that with which the image was 
scanned by the bi-level sensor 58, because most monitors 
cannot display at 200 dpi. Once the necessary 
change in resolution is effected, for example to 60 or 80 
dots per inch, the resulting lower-resolution bi-level data 
is supplied to the video memory 65 and the monitor 17 
and displayed. Preferably, data is added to this lower-resolution 
bi-level data, to show the outlines of the 
blocks identified by the above-mentioned algorithm 
along with, preferably, an indication in each as to what 
type of image content has been identified as residing 
within the block. This identification can be by means of 
a predetermined symbol, or the outline of each region 
can be indicated in a different manner depending on image 
content (for example, a dashed line around text and 
line-drawing content, versus a dotted line or the like 
around portions of color or grayscale data). 
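
The change of resolution for display can be illustrated by a simple nearest-neighbour subsampling, sketched below. The description does not specify how the conversion is carried out, so the method and the name `reduce_resolution` are assumptions.

```python
def reduce_resolution(bitmap, src_dpi, dst_dpi):
    """Subsample a bitmap from src_dpi down to dst_dpi by nearest neighbour.

    Each output pixel copies the source pixel at the proportionally
    scaled position; integer arithmetic keeps the indices in range.
    """
    src_h = len(bitmap)
    src_w = len(bitmap[0]) if src_h else 0
    dst_h = src_h * dst_dpi // src_dpi
    dst_w = src_w * dst_dpi // src_dpi
    return [
        [bitmap[y * src_dpi // dst_dpi][x * src_dpi // dst_dpi]
         for x in range(dst_w)]
        for y in range(dst_h)
    ]
```

For example, `reduce_resolution(page, 200, 80)` keeps two of every five pixels in each direction. Thin strokes can vanish under plain subsampling, so a practical display path might instead mark an output pixel black whenever any source pixel it covers is black.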

While the analysis of the bi-level data is being per- 
formed as described above, the scanner 31 moves mir- 
ror 56 so as to direct the light from the document to the 
color sensor 59. The document is now scanned a sec- 
ond time to obtain color information (step S301 in Figure 
7B), which is provided one line at a time by the color 
sensor 59 to the color page memory 62, in the same 
manner as was done with the bi-level information. For 
each pixel, the three eight-bit signals representing the 
three color component values for that point are supplied 
from the sensor 59 in series, to the color information 
page memory 62, where the three eight-bit signals are 
respectively stored in the red, green and blue portions 
62R, 62G and 62B of that memory. 
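
The routing of the serially supplied color signals into the three portions of the color page memory amounts to a demultiplexing step, which might be sketched as follows. The red, green, blue ordering of the samples within a line and the name `demultiplex_line` are assumptions.

```python
def demultiplex_line(samples):
    """Split one interleaved scan line R, G, B, R, G, B, ... into three planes.

    Returns (red, green, blue) lists, corresponding to the data destined
    for page memories 62R, 62G and 62B respectively.
    """
    if len(samples) % 3 != 0:
        raise ValueError("line length must be a multiple of three samples")
    return samples[0::3], samples[1::3], samples[2::3]
```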

For each area identified as being color/grayscale in 
the page, the bi-level information obtained in the first 
scan and stored in the document image page memory 
63 is replaced with the corresponding color information 
obtained in the second scan (step S302). The informa- 
tion in the video memory 65, and so the display, is also 
updated to show this color information. Thus, as the 
color information is received, the color/grayscale areas 
are displayed on the basis of the color information ob- 
tained in the second scan. 

It should be noted that in this embodiment the color 
scan is performed at a lower resolution than the bi-level 
scan, for example, at 60-80 dots per inch. This difference 
in resolution is not essential, but reduces the memory 
capacity requirements. The present invention is not, 
of course, limited to use of a color scan at a lower resolution 
than the bi-level scan. In some applications, involving 
high-quality image reproduction, it may in fact 
be preferable to perform the color scan at the same resolution 
as the bi-level scan, or even at a higher one. 

If performed at a lower resolution than the bi-level 
scan, as in the present embodiment, the color scan may 
yield data very close to the proper resolution for the 
monitor 17, and if the two resolutions are the same, it 
will be appreciated that no resolution conversion has to 
be performed on the color data to effect display thereof. 



The scanner 31 now performs a second analysis, 
to determine which if any of the color/grayscale areas 
are actually a grayscale rather than color image, and 
then converts the image data obtained for those areas 
in the second scan, to true grayscale information. 

To identify the true grayscale regions, the scanner 
CPU 64 executes the following steps. For each pixel in 
a block which has been identified as color/grayscale, the 
scanner 31 compares the three color-component values 
obtained for the pixel. In a grayscale image, those three 
values should ideally be identical and in practice will not 
be very different from one another. If the scanner determines 
that the pixels in such block all meet this criterion, 
i.e., have their R, G and B values either identical or within 
some predetermined range of each other, the scanner 
decides that the block is grayscale rather than color. 
This is shown in step S303, where the scanner subtracts 
both the green and the blue values G, B for the pixel 
from the red value R, and takes the average of the two 
differences. The scanner then takes the absolute value 
of this average and, in step S304, divides the sum of 
these values for all the pixels in the block by the number 
n of those pixels. This number is then compared to a 
threshold value T. If the average is less than T, the scanner 
decides that the block contains grayscale rather 
than color image (step S305). T preferably can be set 
by the operator, either by means of a special control provided 
on the scanner 31 itself for this purpose or through 
the keyboard or mouse 19 of the document image management 
apparatus shown in Figure 2. A value of 8 is 
thought to be appropriate in most cases (assuming eight 
bits per pixel per color component, as in the present embodiment). 
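
The test of steps S303 to S305 can be expressed compactly. The sketch below follows the arithmetic stated above, with the block given as a list of (R, G, B) triples; the function name `is_grayscale_block` is hypothetical, and the default threshold is the value of 8 suggested for eight-bit components.

```python
def is_grayscale_block(pixels, threshold=8):
    """Decide whether a color/grayscale block is grayscale.

    For each pixel, average the two differences (R - G) and (R - B) and
    take the absolute value (step S303); average those values over the
    n pixels of the block (step S304); the block is taken to be
    grayscale if the result is less than the threshold T (step S305).
    """
    if not pixels:
        raise ValueError("block contains no pixels")
    total = 0.0
    for r, g, b in pixels:
        total += abs(((r - g) + (r - b)) / 2)
    return total / len(pixels) < threshold
```

Since the per-pixel quantity equals R - (G + B)/2, deviations of G and B in opposite directions cancel in this measure; an implementation wishing to guard against that case could compare the components pairwise instead.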

If the block is thus found to be grayscale, the scan- 
ner CPU 64 then converts the color data for the pixels 
in this block to grayscale information (step S306), and 
the processing is repeated for any additional color/gray- 
scale blocks in the document image page memory 63. 

The conversion of the color to true grayscale information 
is performed as follows. The scanner reads the 
color component data R, G and B and takes the arithmetic 
average H of those data for the first pixel in the 
block (steps S501 and S502 in Figure 9). This average 
H is assigned to the pixel in the document image page 
memory 63 in place of the previously scanned R, G and 
B values. Thus, H serves as a 
grayscale data value for that pixel, representing a shade 
of gray, or white or black. This conversion eliminates any 
slight tinting of the grayscale image that may 
have been present in the color data due to peculiarities 
or irregularities in the color sensor 59. 
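
The conversion reduces to one line per pixel, as the following sketch shows. Rounding the average to the nearest integer is an assumption, since the description does not say how fractional averages are handled, and the name `to_grayscale` is hypothetical.

```python
def to_grayscale(pixels):
    """Replace each (R, G, B) triple by the arithmetic average H of its components."""
    return [round((r + g + b) / 3) for r, g, b in pixels]
```

A faintly tinted pixel such as (100, 101, 99) becomes the neutral value 100, which is the tint-removing effect noted above.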

At this point, the display on the monitor 17 should 
reflect very closely the actual appearance of the original 
document. However, if it happens that any of the blocks 
in the document overlap, or if the operator observes any 
portion which appears not to have been properly 
scanned or processed (steps S401 and S402 in Figure 
7C), he or she now designates such area by means of 
the mouse or keyboard inputs 19 (step S403), and enters 
an instruction for the area to be treated in whatever 
fashion he or she considers proper (step S404). For ex- 
ample, if the scanner has erroneously identified a par- 
ticular area as a bi-level block, when it is actually a gray- 
scale block, the operator can designate the block and 
instruct that it be presented as grayscale information. 
The operator can, if he or she deems it necessary, in- 
struct that the area should be rescanned for this purpose 
(step S404). Upon receiving these instructions, the 
scanner rescans the page using the color sensor 59, if 
a rescan has been instructed (steps S405, S409 and 
S410), and modifies the contents of the color page 
memory 62 accordingly. The new color information is 
then converted into grayscale data (step S412), which 
is substituted for the previous bi-level information in the 
portion of the document image page memory 63 for the 
area in question (step S408). 

Alternatively, if the operator does not order a res- 
can, the scanner can simply convert the color informa- 
tion stored in the original color scan directly into gray- 
scale information for the designated area and substitute 
it for the information previously present in the document 
image page memory for that area (steps S409, S411, 
S412 and S408). 

Once the operator is satisfied with the displayed 
portion of the page, an instruction to that effect is en- 
tered, and the next portion of the page is displayed 
(steps S413 and S414), and the foregoing processing 
is repeated as necessary, for the new portion. If the last 
portion has now met the operator's requirements, the 
processing is ended. 

Once the operator has indicated that the page is 
satisfactory, the data for the page is ready for storage. 
In addition to the bi-level, color and grayscale data ac- 
cording to the content of the various blocks identified in 
the original document, the data for the page includes 
information derived by the block analysis algorithm, re- 
lating to page size, margins, frames, horizontal and ver- 
tical lines (their presence, size, length and location), etc. 
In addition, the algorithm used in the present embodi- 
ment defines the blocks of text, color image, etc. in such 
a manner as to exclude the purely background-color ar- 
eas of the document as much as possible, thus reducing 
the amount of information required for the various 
blocks. 

More particularly, the data for the document page 
includes the size of the page expressed as a frame, 
whose thickness or width defines the margin of the 
page. Figure 10 is a representative view of one way in 
which the structural information can be arranged. As 
shown in Figure 10, for each document the structural 
information includes a document identifier 51 which is 
also assigned to the full document image, and by means 
of which it is intended for the document to be retrieved 
once its entry into the document image management 
system is complete. In area 52, information relating to 
the document type is stored. At 53, all of the information 
for the document, and its layout within the document, is 
stored. As shown at 53, for example, for each region are 
stored a region identifier, a region type, rectangular coordinates 
that specify the position of the region on the 
page (here the upper left corner ("ULC") and the lower 
right corner ("LRC") coordinates), and all related regions 
(for example, this information may indicate the relation 
between a text block which contains the legend for a 
grayscale or full-color illustration, or for a line drawing). 
In Figure 10, region 1 corresponds to region 41 in Figure 
3 and, as shown in Figure 10, includes a "title" type indication, 
upper left-hand coordinates <0,0>, lower right-hand 
coordinates <5,40>, and no indication of related 
regions. The other regions illustrated in Figure 10 follow 
similarly. 
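
One possible in-memory form of the structural information of Figure 10 is sketched below. Only region 1 ("title", <0,0> to <5,40>, no related regions) is taken from the description; the class names, the document identifier and type strings, and regions 2 and 3 are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Region:
    region_id: int
    region_type: str
    ulc: Tuple[int, int]                  # upper left corner coordinates
    lrc: Tuple[int, int]                  # lower right corner coordinates
    related: List[int] = field(default_factory=list)   # ids of related regions


@dataclass
class DocumentStructure:
    document_id: str
    document_type: str
    regions: List[Region] = field(default_factory=list)

    def related_to(self, region_id: int) -> List[Region]:
        """Return the regions recorded as related to the given region."""
        wanted = next(r for r in self.regions if r.region_id == region_id).related
        return [r for r in self.regions if r.region_id in wanted]


example = DocumentStructure(
    document_id="DOC-0001",               # hypothetical identifier
    document_type="report",               # hypothetical document type
    regions=[
        Region(1, "title", (0, 0), (5, 40)),               # from the description
        Region(2, "grayscale", (10, 0), (30, 20), [3]),    # hypothetical
        Region(3, "text", (31, 0), (33, 20), [2]),         # hypothetical legend
    ],
)
```

Here `example.related_to(2)` returns the legend block, region 3, which is the kind of relation between an illustration and its text that the table is meant to record.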

According to this embodiment, the image data representing 
the page from the original document can be 
stored conveniently in a manner based on the TIFF 
standard. It is particularly contemplated that each block 
identified by the algorithm as containing bi-level information 
(text or line drawings), grayscale information, or 
full color information, will be stored in a respective bit 
map containing only information of the kind best suited 
to the image type of the block. The various blocks are 
associated together in a single image file, with the information 
in the table shown in Figure 10 stored in any convenient 
form consistent with inclusion in a format based 
on the TIFF standard. 

Still further reduction of the total memory space required 
to store the document can be achieved, by using 
OCR to reduce textual portions to ASCII codes, or by 
using standard compression techniques to best advantage 
where OCR, and hence use of ASCII codes, proves 
impractical (for example, in the case of text in an unrecognizable 
font). In addition, grayscale data or full-color 
data, or both, may be subjected to image compression 
by any suitable technique that offers sufficient reduction 
in data quantity to be worthwhile. Finally, any portion of 
a page whose content is not specified in any of these 
ways, is understood to be left blank. Thus, blank areas 
need not be stored. 

Even if the use of a vector representation for encoding 
some of the information (for example, frame and horizontal 
and vertical line information) may increase the 
complexity of the file format, nonetheless, that feature 
of the invention, particularly when combined with the 
linked bit map manner of storage and the use of ASCII 
code storage for text portions to the extent possible, has 
the advantage of greatly reducing the amount of memory 
required, especially for documents where the majority 
of the space on a page is taken up by text or is blank. 

The Second Embodiment And Document Capture Therein 

A second embodiment of a scanner 31 according 
to the invention is shown in Figures 11 and 12. This 
scanner differs from that of Figures 5 and 6 in having 
three, instead of two, CCD sensors 58, 59 and 71. As 
in the first embodiment, two of the sensors, 58 and 59, 
are respectively for performing a high-resolution scan of 
a document to produce bi-level information, and a preferably 
lower-resolution scan to produce color information. 
The third CCD sensor 71 is for performing a scan 
of the document to produce grayscale information directly, 
rather than having to calculate grayscale information 
from color data. Preferably, the grayscale sensor 71 
has the same resolution as the color sensor 59. 

The scanner of Figures 11 and 12 also differs from 
that of Figures 5 and 6 in having a second movable mirror 
72 or the like so that the light reflected from the document 
can be directed to any of the three sensors. A 
third page memory 73 is also provided, for grayscale data. 
The grayscale page memory 73 represents the same 
number of pixels, with the same number of bits per pixel, 
as in each color component of the color image data. 

Figure 13 is a flowchart illustrating the operation of 
the scanner of Figure 11. 

As in the first embodiment, the process in this embodiment 
begins with placement of a document 52 on 
the platen 51, and entry of an instruction by the operator 
to commence scanning. First, the scanner positions 
movable mirror 56 so as to direct light from the document 
to the bi-level CCD sensor 58 (step S601). Consequently, 
this sensor outputs a digital signal consisting 
of one bit per pixel, for each scan line of the document, 
and this signal is stored in the bi-level page memory 61, 
and this information is copied into the document image 
page memory. 

The scanner 31 processes this information to identify 
blocks of common image type in the page (step 
S602). This analysis is again carried out using the algorithm 
disclosed in commonly-assigned application Serial 
No. 07/873,012. As in the first embodiment, any other 
algorithm which will perform the desired analysis may 
be used instead, but the mentioned one is the preferred 
manner for carrying out this part of the invention. 

The monitor 17 then is caused to display an image 
of one-half or more of the page at a resolution which 
ordinarily must be considerably below that with which 
the image was scanned by the bi-level sensor, because 
most monitors cannot display at 200 dpi. The necessary 
change in resolution is effected, and the resulting 
lower-resolution bi-level data is supplied to the video 
memory 65 and the monitor 17 and displayed. It is again 
preferable that the scanner adds to the bi-level image 
data, supplemental data indicating the outlines of the regions 
identified by the above-mentioned algorithm along 
with an indication in each as to what type of image content 
has been identified as residing within the block. 

While or after this analysis is performed, the scanner 
positions movable mirrors 56 and 72 so that light 
reflected from the original will now be directed to the 
color sensor 59, and the document is scanned again 
to obtain color information, which is provided one line at 
a time by the color sensor to the color page memory 62 
(step S603). For each area identified as being color/grayscale 
in the page, the bi-level information obtained 
in the first scan is replaced, in the document image page 
memory 63, with the color information obtained in the 
second scan, and the information being sent to the video 
memory 65 and the monitor 17 for display is modified in 
the same way. Thus, as the color information is received, 
the color/grayscale areas are displayed on the 
basis of the color information obtained in the second 
scan. 

As in the first embodiment, the color scan is prefer- 
ably, but not necessarily, performed at a lower resolution 
than the bi-level scan, for example, at 60-80 dots per 
inch. 

The scanner now performs an analysis, this time of 
the color data obtained in the second scan, to determine 
whether any of the color/grayscale blocks identified in 
step S602 are actually grayscale rather than color image 
blocks (step S604). This analysis is performed using the 
procedure illustrated in the flow chart of Figure 7B and 
described above. 

If any grayscale blocks are identified in this manner, 
a third scan is now performed (step S605). For this scan, 
the movable mirrors 56 and 72 are positioned so that 
the light from the document 52 is conveyed to the third 
CCD sensor 71, the grayscale sensor. 

In this way the document page is scanned to obtain 
grayscale information, and that body of information is 
stored in the grayscale page memory 73. The grayscale 
information for the areas identified as grayscale blocks 
is substituted in the document image page memory 63 
for the previously obtained color data. The monitor dis- 
play is also updated in the same way to display the gray- 
scale data for the affected blocks. 

At this point, the display on the monitor 17 should 
reflect the actual appearance of the original document. 
However, if the operator observes any portion which ap- 
pears unsatisfactory, he or she now designates such ar- 
ea by means of the mouse or keyboard inputs, and en- 
ters an instruction for the area to be treated in whatever 
fashion he or she considers proper (step S606). This 
portion of the processing is the same as in the first em- 
bodiment, and therefore will not be described in detail 
again. 

Once the displayed portion of the page has been 
completed to the operator's satisfaction, the operator 
enters an instruction to end processing of that portion of 
the page. If the entire page is now satisfactory, the pro- 
cedure ends, while otherwise, another portion of the 
page is displayed for the operator's review and correc- 
tion. 

Once the operator has indicated that the page is 
satisfactory, the data for the page, comprising bi-level, 
color and grayscale data according to the content of the 
original document, is ready for storage. Storage of the 
information for the page is performed in the same man- 
ner as in the first embodiment. 

The Third Embodiment And Document Capture Therein 

Figures 14 and 15 show a third embodiment of the 
scanner of the invention. This embodiment is the same 
as that of Figure 5 in most respects, and accordingly, 
only the differences will be described. 

This embodiment has only two CCD sensors 58 and 
73, one of high and one of low resolution. In this embodiment, 
the low-resolution CCD sensor 73 is capable of 
providing both color and grayscale image information. 
Accordingly, this embodiment has a page memory 74 to 
receive the grayscale image data read by the second 
CCD sensor 73 from the document. 

The second CCD sensor 73 differs from the low-resolution 
color sensor 59 of the first embodiment, in not 
having color filters of different colors covering the light-receiving 
surface of consecutive pixels. Instead, there 
are provided three color filters, red, green and blue 75R, 
75G and 75B, which can selectively be moved into a 
position such as to intercept the light as it travels from 
the light source 53 to the document 52 lying on the platen 
51 (in Figure 15, the red filter 75R is in this position). 
When the red filter is so positioned, the document is 
scanned with red light instead of with white light. Consequently, 
the charge accumulated in the CCD sensor 
73 and the signal read out from the sensor 73 represent 
the red color component of the document image information. 
The information contained in that signal is of 
course stored in the red page memory 62R. After the 
information of one color component has been read in 
this way, the first filter 75R is moved out of the way, and 
one of the other two, 75G or 75B, is interposed in the light 
path. Since the information for a given color component 
for an entire page is thus received without interruption 
by other information in this embodiment, each of the 
three color-component page memories 62R, 62G and 
62B is filled before data begins to be supplied to the 
next. This is different from the first and second embodiments, 
in which each pixel of the color CCD sensor produces 
information relating to a different color component 
from those to which its two immediate neighbors 
relate, requiring demultiplexing of the resulting signal into 
the three color-component page memories in those 
embodiments. 
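
The sequential filling of the three page memories in this embodiment can be sketched as a loop over the filters. The callable `scan_page`, standing in for one complete pass of the document with the named filter in the light path, is hypothetical, as is the function name `capture_color_planes`.

```python
def capture_color_planes(scan_page, filters=("R", "G", "B")):
    """Fill one page memory per color component, one full scan at a time.

    `scan_page(name)` is assumed to return the page as a list of scan
    lines of 8-bit samples, read with the filter `name` interposed.
    No demultiplexing is needed: each scan yields a single component.
    """
    page_memories = {}
    for name in filters:
        page_memories[name] = [list(line) for line in scan_page(name)]
    return page_memories
```

Each entry of the returned dictionary corresponds to one of the memories 62R, 62G and 62B, and each is complete before the next scan begins.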

As in the first and second embodiments, the process 
in this embodiment begins with placement of a document 
on the platen 51, and entry of an instruction by 
the operator to commence scanning. First, the scanner 
positions the movable mirror 56 so as to direct light from 
the document to the bi-level CCD sensor 58. Consequently, 
this sensor outputs a digital signal consisting of 
one bit per pixel, for each scan line of the document, 
and this signal is stored in the bi-level page memory 61 
and copied into the document image page memory 63. 

The scanner 31 processes this information to identify 
blocks of common image type in the page. This analysis 
is again carried out using the algorithm disclosed 
in commonly-assigned application Serial No. 
07/873,012. As in the first two embodiments, any other 
algorithm which will perform the desired analysis may 
be used instead, but the mentioned one is the preferred 
manner for carrying out this part of the invention. 

The monitor 17 then is caused to display an image 
of one-half or more of the page at a resolution which 
ordinarily must be considerably below that with which 
the image was scanned by the bi-level sensor, because 
most monitors cannot display at 200 dpi. The necessary 
change in resolution is effected, and the resulting 
lower-resolution bi-level data is supplied to the monitor 
and displayed. It is again preferable that the scanner 
adds to the bi-level image data, supplemental data indicating 
the outlines of the regions identified by the above-mentioned 
algorithm along with an indication in each as 
to what type of image content has been identified as residing 
within the block. 

While or after this analysis is performed, the scanner 
moves the movable mirror 56 so that light reflected 
from the original will now be directed to the second CCD 
sensor 73. After the completion of the analysis, the document 
52 is scanned again to obtain color information. 
Actually, three scans of the page are now performed, 
each being done with a different one of the three filters 
75R, 75G and 75B in place, and each providing information 
of only one color component. The resulting information 
is provided one line at a time by sensor 73 to 
the color page memory 62. For each area identified as 
being color/grayscale in the page, the bi-level information 
obtained in the first scan is replaced, in the document 
image page memory 63, with the color information 
obtained in the color scanning, and the information being 
sent to the video memory 65 and the monitor 17 for 
display is modified in the same way. Thus, once the information 
for all three color components is received, the 
color/grayscale areas are displayed on the basis of the 
color information obtained. 

As in the previous embodiments, the color scanning 
is preferably, but not necessarily, performed at a lower 
resolution than the bi-level scan, for example, at 60-80 
dots per inch. 

The scanner now performs an analysis of the color 
data to determine whether any of the color/grayscale 
blocks are actually grayscale rather than color image 
blocks. This analysis is performed using the procedure 
illustrated in the flow chart of Figure 7B and described 
above. 

If any grayscale blocks are identified in this manner, 
a third scan is now performed. For this scan, the movable 
mirror 56 is left in position such that the light from 
the document is conveyed to the second CCD sensor 
73, but the color filters 75 are all withdrawn from the light 
path, so that the light received by the second CCD sensor 
73 represents grayscale information, rather than 
color information. 

In this way the document page is scanned to obtain 
grayscale information, and that body of information is 
stored in the grayscale page memory 74. The grayscale
information for the areas identified as grayscale blocks
is substituted in the document image page memory 63 
for the previously obtained color data. The monitor dis- 
play is also updated in the same way to display the gray- 
scale data for the affected blocks. 

At this point, as in the previous embodiments, the 
operator reviews the processed document and, if any 
portion of the document appears unsatisfactory, he or 
she now designates such area by means of the mouse 
or keyboard inputs 19, and enters an instruction for the 
area to be treated in whatever fashion he or she consid- 
ers proper. This portion of the processing is the same 
as in the first and second embodiments. 

Once the displayed portion of the page has been 
completed to the operator's satisfaction, the operator 
enters an instruction to end processing of that portion of 
the page. If the entire page is now satisfactory, the pro- 
cedure ends, while otherwise, another portion of the 
page is displayed for the operator's review and correc- 
tion. 

Once the operator has indicated that the page is 
satisfactory, the data for the page, comprising bi-level, 
color and grayscale data according to the content of the 
original document, is ready for storage. Storage of the 
information for the page is performed in the same man- 
ner as in the first embodiment. 

The Fourth Embodiment And Document Capture 
Therein 

Figures 16 and 17 show a fourth embodiment of the
scanner of the invention. This embodiment is the same 
as that of Figure 5 in most respects, and accordingly, 
only the differences will be described. 

This embodiment has only one CCD sensor 76. The 
resolution of this CCD sensor 76 is equal to the highest 
resolution it is desired to obtain; typically, that will be the 
bi-level data, as described above. In this embodiment, 
the single CCD sensor 76 is capable of providing color, 
grayscale and bi-level image information. Accordingly, 
this embodiment has a page memory 74 to receive the
grayscale image data read by the CCD sensor 76
from the document.

The CCD sensor 76 in this embodiment may be like 
the high-resolution sensor of the first embodiment. To 
obtain the high-resolution bi-level data, the CCD sensor 
76 is operated exactly as is the bi-level sensor in the 
first embodiment. 

For the lower-resolution grayscale data, the outputs 
of several pixels of the CCD sensor 76 are combined 
using, preferably, analog circuitry, that is, before digiti- 
zation of the signal. For example, if the bi-level CCD 
sensor 76 has a resolution of 200 dots per inch, and the 
desired grayscale resolution is 100 dots per inch, then 
information from two adjacent cells in the CCD 76 can 
be combined, and the analog data from two successive 
lines can be combined, for a total of four cells of infor- 
mation being combined for each grayscale pixel. 
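
The four-cell combination in this example can be illustrated numerically. Note that the sketch below averages values that have already been digitized, whereas the text prefers analog combination before digitization; it is therefore only a stand-in for the idea. The function name `bin_pixels` and the list-of-rows layout are assumptions made for illustration.

```python
def bin_pixels(rows, factor=2):
    """Average factor x factor neighbourhoods of a 2-D list of sensor values.

    Hypothetical digital stand-in for the analog combination described in
    the text. Trailing rows or columns that do not fill a complete
    neighbourhood are dropped.
    """
    height = len(rows) // factor
    width = (len(rows[0]) // factor) if rows else 0
    out = []
    for by in range(height):
        out_row = []
        for bx in range(width):
            total = 0
            for dy in range(factor):
                for dx in range(factor):
                    total += rows[by * factor + dy][bx * factor + dx]
            out_row.append(total / (factor * factor))
        out.append(out_row)
    return out


# A 4 x 4 patch sampled at 200 dpi becomes a 2 x 2 patch at 100 dpi:
# two adjacent cells on each of two successive lines form one output pixel.
patch = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
binned = bin_pixels(patch, factor=2)
```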



Also, like the bi-level sensor in the first embodiment, 
the CCD sensor 76 is not provided with individual color 
filters accurately positioned on the individual pixels of 
the sensor. Instead, as in the embodiment of Figures 14
and 15, there are provided three color filters, red, green
and blue 75R, 75G and 75B, which can selectively be
moved into a position such as to intercept the light as it
travels from the light source to the document 52 lying
on the platen 51 (in Figure 16, the red filter 75R is in this
position). When the red filter 75R is so positioned, the
document 52 is scanned with red light instead of with
white light. Consequently, the charge accumulated in
the CCD sensor 76 and the signal read out from the sensor
represent the red color component of the document
image information. The information contained in that
signal is of course stored in the red page memory 62R.
After the information of one color component has been
read in this way, the first filter 75R is moved out of the
way, and one of the other two 75G or 75B is interposed
in the light path. Since the information for a given color 
component for an entire page is thus received without 
interruption by other information in this embodiment, 
each of the three color-component page memories 62R, 
62G and 62B is filled before data begins to be supplied 
to the next. This is different from the first and second 
embodiments, in which each pixel of the color CCD sen- 
sor produces information relating to a different color 
component from those to which its two immediate neigh- 
bors relate, requiring demultiplexing of the resulting sig- 
nal into the three color-component page memories in 
those embodiments. In this embodiment, preferably, the 
outputs of plural adjacent pixels, and of an equal number 
of successive rows, are combined as in obtaining the 
grayscale data. This produces color data of lower
resolution than the bi-level data.
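
The difference between the two ways of filling the color-component page memories can be sketched as below. Both function names are hypothetical, and the R, G, B pixel ordering assumed for the interleaved sensor of the earlier embodiments is an assumption made only for this illustration.

```python
def fill_sequential(scans):
    """Fill each color-component memory completely before starting the next.

    scans: dict mapping 'R', 'G', 'B' to a full page (list of rows) captured
    with the corresponding filter in the light path. The R, G, B order used
    here is illustrative; the text does not fix the order of the filters.
    """
    memories = {}
    for component in ("R", "G", "B"):
        memories[component] = [list(row) for row in scans[component]]
    return memories


def demultiplex_line(line):
    """Split one interleaved sensor line (assumed R, G, B, R, G, B, ...)
    into three component lines, as needed when neighbouring pixels each
    report a different color component."""
    return {"R": line[0::3], "G": line[1::3], "B": line[2::3]}


sequential = fill_sequential({"R": [[1, 2]], "G": [[3, 4]], "B": [[5, 6]]})
split = demultiplex_line([1, 2, 3, 4, 5, 6])
```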

As in the first, second and third embodiments, the 
process in this embodiment begins with placement of a 
document on the platen 51, and entry of an instruction
by the operator to commence scanning. First, the CCD 
sensor 76 outputs a digital signal consisting of one bit 
per pixel, for each scan line of the document, and this 
signal is stored in the bi-level page memory 61 and cop- 
ied into the document image page memory 63. 

The scanner processes this information to identify 
blocks of common image type in the page. This analysis 
is again carried out using the algorithm disclosed in EP- 
A-0567344 (US Serial No. 07/873,012). As in the first 
three embodiments, any other algorithm which will per- 
form the desired analysis may be used instead, but the 
mentioned one is the preferred manner for carrying out 
this part of the invention. 

The monitor then is caused to display an image of 
one-half or more of the page at a resolution which ordi- 
narily must be considerably below that with which the 
image was scanned by the CCD sensor 76, because most
monitors cannot display at 200 dpi. The necessary 
change in resolution is effected, and the resulting lower- 
resolution bi-level data is supplied to the monitor and
displayed. It is again preferable that the scanner adds
to the bi-level image data, supplemental data indicating
the outlines of the regions identified by the above-mentioned
algorithm along with an indication in each as to
what type of image content has been identified as residing
within the block.

After the completion of the analysis, the document
is scanned again to obtain color information. Actually,
three scans of the page 52 are now performed, each
being done with a different one of the three filters 75R,
75G and 75B in place, and each providing information
of only one color component. The resulting information
is provided one line at a time by the CCD sensor 76 to
the color page memory 62. For each area identified as
being color/grayscale in the page, the bi-level information
obtained in the first scan is replaced, in the document
image page memory 63, with the color information
obtained in the color scanning, and the information being
sent to the video memory 65 and the monitor for display
is modified in the same way. Thus, once the information
for all three color components is received, the
color/grayscale areas are displayed on the basis of the
color information obtained.

As in the previous embodiments, the color scanning
is preferably, but not necessarily, performed at a lower
resolution than the bi-level scan, for example, at 60-80
dots per inch.

The scanner now performs an analysis of the color
data to determine whether any of the color/grayscale
blocks are actually grayscale rather than color image
blocks. This analysis is performed using the procedure
illustrated in the flow chart of Figure 7B and described
above.

If any grayscale blocks are identified in this manner,
a third scan is now performed. For this scan, the color
filters 75 are all withdrawn from the light path, so that
the light received by the CCD sensor 76 represents
grayscale information, rather than color information.

In this way the document page 52 is scanned to obtain
grayscale information, and that body of information
is stored in the grayscale page memory 74. The grayscale
information for the areas identified as grayscale
blocks is substituted in the document image page memory
63 for the previously obtained color data. The monitor
display is also updated in the same way to display
the grayscale data for the affected blocks.

At this point, as in the previous embodiments, the
operator reviews the processed document and, if any
portion of the document appears unsatisfactory, he or
she now designates such area by means of the mouse
or keyboard inputs, and enters an instruction for the area
to be treated in whatever fashion he or she considers
proper. This portion of the processing is the same as in
the first, second and third embodiments.

Once the displayed portion of the page has been
completed to the operator's satisfaction, the operator
enters an instruction to end processing of that portion of
the page. If the entire page is now satisfactory, the
procedure ends, while otherwise, another portion of the
page is displayed for the operator's review and correction.

Once the operator has indicated that the page is 
satisfactory, the data for the page, comprising bi-level, 
color and grayscale data according to the content of the 
original document, is ready for storage. Storage of the 
information for the page is performed in the same man- 
ner as in the first embodiment. 

The invention has been described with reference to 
several embodiments. Many modifications and varia- 
tions, however, also are within the scope of what the 
present inventors regard as their invention. Some will 
now be mentioned briefly. 

First, the scanner in the foregoing embodiments 
performs the processing described above. It is within the 
scope of the invention for the CPU 11 of the document 
image management system of which the scanner forms 
a part, to perform some or all of that processing. 

Also, during analysis of the color image information 
to identify grayscale areas, as described above, an area 
is so identified only if the scanner finds that the entire 
block appears to be grayscale image, using the criterion 
described above with reference to Figure 7B. As an al- 
ternative, the scanner may keep track of the value Jj for 
each pixel in the block being examined, and if it identifies 
a region of contiguous pixels within the color block as 
each separately meeting the criterion for grayscale im- 
age (Jj < T), the scanner presents the operator with an 
outline on the monitor display of the region defined by 
the contiguous grayscale pixels and requests the oper- 
ator's instruction as to whether the region so indicated 
should be converted to grayscale data or should be re- 
tained as color data. 
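
A minimal sketch of this alternative is given below. The per-pixel criterion of Figure 7B is not reproduced in this passage, so the measure used here (the spread between the largest and smallest of the three color components) is purely an assumed stand-in for the value compared against T; the function names and the choice of 4-connectivity are likewise illustrative assumptions.

```python
from collections import deque


def chroma_spread(pixel):
    """Assumed stand-in for the per-pixel value compared against T:
    the spread between the largest and smallest color component."""
    return max(pixel) - min(pixel)


def grayscale_regions(block, threshold):
    """Group 4-connected pixels that each satisfy chroma_spread < threshold.

    block: 2-D list of (R, G, B) tuples for one color block.
    Returns a list of regions, each a list of (row, column) positions,
    which could then be outlined on the monitor for the operator.
    """
    height = len(block)
    width = len(block[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or chroma_spread(block[y][x]) >= threshold:
                continue
            # Breadth-first flood fill from this unvisited grayscale pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            region = []
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and chroma_spread(block[ny][nx]) < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


gray, red = (100, 100, 100), (200, 20, 20)
sample_block = [
    [gray, gray, red],
    [red, gray, red],
]
found = grayscale_regions(sample_block, threshold=10)
```

In this sample the three gray pixels form a single contiguous region; the decision whether to convert that region to grayscale data remains with the operator, as the text describes.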

In addition, while in the foregoing description, bi-lev- 
el image data is discussed in terms of being black on a 
white background, it will be appreciated by those skilled 
in the art that the present invention is fully applicable to 
documents in which text, line drawings, etc. are printed 
in some other color than black that makes a sufficiently 
high contrast with the background, as for example a dark 
blue on a white background. In such case, it is within the
scope of the invention for the apparatus described 
above to identify the color in the color scan, and to store 
an indication of that color with the bi-level information. 
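
One conceivable way of identifying that color is sketched below. The text does not say how the color is determined, so the averaging approach, the function name `foreground_color`, and the assumption that the bi-level and color data have already been brought to the same resolution are all hypothetical.

```python
def foreground_color(bilevel, color):
    """Average the color-scan pixels that lie under foreground (1) bits.

    bilevel: 2-D list of 0/1 values for one bi-level block.
    color:   2-D list of (R, G, B) tuples covering the same block, assumed
             here to be resampled to the same resolution as the bi-level data.
    Returns an (R, G, B) tuple, or None if the block has no foreground bits.
    """
    totals = [0, 0, 0]
    count = 0
    for bit_row, color_row in zip(bilevel, color):
        for bit, pixel in zip(bit_row, color_row):
            if bit:
                for i in range(3):
                    totals[i] += pixel[i]
                count += 1
    if count == 0:
        return None
    return tuple(t // count for t in totals)


white = (255, 255, 255)
bits = [[1, 0], [0, 1]]
scan = [[(0, 0, 120), white], [white, (0, 0, 140)]]
ink = foreground_color(bits, scan)  # dark blue text on a white background
```

The resulting tuple is the kind of color indication that could be stored alongside the bi-level information for the block.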

As another alternative, while the scanner of the 
three embodiments described in detail above is shown 
in Figures 1 and 2 as being a part of a larger document 
image management apparatus and system, the scan- 
ner, monitor and optical disk system could be used alone 
for capturing document images, if the scanner is provid- 
ed with the capability of performing all of the control 
functions required by those components. 

Also, while the foregoing embodiments provide the 
user with the ability to obtain bi-level, full-color and gray- 
scale information for a document, there may be appli- 
cations in which it is sufficient to obtain just bi-level and 
color, without grayscale, or just bi-level and grayscale,
without color information. In any such cases, the foregoing
embodiments can be modified by removing the
unneeded capabilities, thus resulting in a reduction of
hardware and software complexity.

While the present invention has been described 
with reference to the preferred embodiments, many 
modifications and variations of those embodiments will 
now be apparent to those skilled in the art. Accordingly, 
the scope of the present invention is not to be limited by 
the details of the embodiments described herein, but on- 
ly by the terms of the appended claims. 



Claims 

1. An image scanning apparatus comprising: 

a sensor having multiple pixels which, in re- 
sponse to exposure to light, output signals 
which, for each pixel, vary as a function of how 
much light that pixel received in said exposure; 
a color filter system; and 

control circuitry which controls said sensor and 
said color filter system such that, in a first mode, 
the signals output by said sensor represent 
color information, and in a second mode, the 
signals output by said sensor represent only lu- 
minance information. 

2. An image scanning apparatus according to Claim 
1, wherein said color filter system is movable be- 
tween at least a first position and a second position, 
said first position being one in which said color filter 
system is so located that light incident on the pixels 
of the sensor first passes through the color filter sys- 
tem, and said second position being one in which 
said color filter system is so located that light inci- 
dent on the pixels of the sensor does not pass 
through the color filter system. 

3. An image scanning apparatus according to Claim 1 
or 2, wherein said sensor is constructed to permit 
outputs from plural pixels to be combined into a sin- 
gle output signal, thereby reducing resolution of in- 
formation conveyed by the signals output by said 
sensor, and wherein said control circuitry selectively
causes said sensor to operate with or without
such combination of outputs. 

4. An image scanning apparatus, comprising: 

a first scanning device; 
a second scanning device; and 
a control system structured and arranged to ef- 
fect a first scan of an image, in which first scan 
the image is scanned only by said first scanning 
device, and then a second scan of the image, 
in which second scan the image is scanned only
by said second scanning device.

5. An image scanning apparatus, comprising: 

a first scanning device; 
a second scanning device; and 
an image-content detector, structured and ar- 
ranged to detect image-type of an image, 
based on data obtained as a result of a scan of 
the image by said first scanning device; and 
a control system, for controlling said first and 
second scanning devices, 
wherein said control system operates to effect 
a first scan of an image, in which first scan the 
image is scanned by said first scanning device, 
after which, responsive to a detection by said 
image-content detector that the image contains 
image content of a particular type, said control 
system operates to effect a second scan of the 
image, in which second scan the image is 
scanned by said second scanning device. 

6. An image scanning apparatus, comprising: 

a first sensor; 

a second sensor; 

a scan mechanism; and 

a control system, for controlling said first and 
second sensors and said scan mechanism, 
wherein said control system causes said scan 
mechanism to effect a first scan of an image, in 
which first scan the image is scanned only by 
said first sensor, after which said control sys- 
tem causes said scan mechanism to effect a 
second scan of the image, in which second 
scan the image is scanned only by said second 
sensor. 

7. An image scanning apparatus, comprising: 

a first sensor; 
a second sensor; 
a scan mechanism; 

an image-content detector, structured and ar- 
ranged to detect image-type of an image, 
based on data obtained as a result of said scan 
mechanism effecting a scan of the image by 
said first sensor; and 

a control system, for controlling said first and 
second sensors and said scan mechanism, 
wherein said control system causes said scan 
mechanism to effect a first scan of an image, in 
which first scan the image is scanned by said 
first sensor, after which, responsive to a detec- 
tion by said image-content detector that the im- 
age contains image content of a particular type, 
said control system causes said scan mechanism
to effect a second scan of the image, in
which second scan the image is scanned by
said second sensor. 

8. An image scanning apparatus, comprising:

a first sensor;
a second sensor;
a scan mechanism;
a memory, for storage of image data representing
the image; and
an analysis and control system, structured and
arranged to detect image-type of an image,
based on data obtained as a result of said scan
mechanism effecting a scan of the image by
said first sensor, and to identify particular portions
of the image as being of respective image-types,
and for controlling said first and second
sensors, said scan mechanism and said memory,
wherein said analysis and control system causes
said scan mechanism to effect a first scan
of an image, in which first scan the image is
scanned by said first sensor, after which, responsive
to a determination by said analysis
and control system that the image contains image
content of a particular type in at least one
portion, said analysis and control system causes
said scan mechanism to effect a second
scan of the image, in which second scan the
image is scanned by said second sensor, and
wherein said analysis and control system initially
causes to be stored in said memory image
data obtained by said first sensor for all portions
of the image identified, and also causes to be
stored in said memory image data obtained by
said second sensor, only for portions of the image
identified as being the particular image-type.

9. An image scanning apparatus, comprising:

a first sensor;
a second sensor;
a scan mechanism;
a display; and
a control system, for controlling said first and
second sensors, said display and said scan
mechanism,
wherein said control system causes said scan
mechanism to effect a first scan of an image, in
which first scan the image is scanned by said
first sensor, and then causes said display to display
image data obtained in the first scan, after
which said control system causes said scan
mechanism to effect a second scan of the image
responsive to entry of a second-scan instruction
by an operator, in which second scan
the image is scanned by said second sensor.

10. An image scanning apparatus, comprising:

a first sensor;
a second sensor;
a scan mechanism;
a display; and
an analysis and control system, structured and
arranged to detect image-type of an image,
based on data obtained as a result of said scan
mechanism effecting a scan of the image by
said first sensor, and to identify particular portions
of the image as being of respective image-types,
and for controlling said first and second
sensors, said scan mechanism and said display,
wherein said analysis and control system causes
said scan mechanism to effect a first scan
of an image, in which first scan the image is
scanned by said first sensor, after which, responsive
to a determination by said analysis
and control system that the image contains image
content of a first and second types in at
least first and second respective portions, said
control system causes said scan mechanism to
effect a second scan of the image, in which second
scan the image is scanned by said second
sensor, and
wherein said analysis and control system initially
causes the image to be displayed by said
display, using image data obtained by said first
sensor for portions of the image identified as
being of the first image-type, and using image
data obtained by said second sensor, only for
portions of the image identified as being the
second image-type.

11. An image scanning apparatus, comprising:

a first sensor;
a second sensor;
a scan mechanism;
a memory, for storage of image data representing
the image; and
an analysis and control system, structured and
arranged to detect image-type of an image,
based on data obtained as a result of said scan
mechanism effecting a scan of the image by
said first sensor, and to identify particular portions
of the image as being of respective image-types,
and for controlling said first and second
sensors, said scan mechanism and said memory,
wherein said analysis and control system causes
said scan mechanism to effect a first scan
of an image, in which first scan the image is
scanned by said first sensor, after which, responsive
to a determination by said analysis
and control system that the image contains image
content of a particular type in at least one
portion, said control system causes said scan
mechanism to effect a second scan of the image,
in which second scan the image is
scanned by said second sensor, and
wherein said analysis and control system initially
causes to be stored in said memory image
data obtained by said first sensor for all portions
of the image identified, and causes to be stored
in said memory image data obtained by said
second sensor, only for portions of the image
identified as being the particular image-type,
and
wherein the image data is stored in said memory
in respective bit maps for respective portions
of the image, the respective bit maps being
linked in said memory.

12. An image scanning apparatus according to any of 
Claims 4 to 11, further comprising a third sensor, 
and wherein said control system causes said scan 
mechanism to effect a third scan of the image in 
which third scan the image is scanned only by said 
third sensor. 

13. An image scanning apparatus according to any of
Claims 4 to 11, wherein one of said sensors outputs
bi-level image information and the other of said
sensors outputs color image information.

14. An image scanning apparatus according to Claim
13, wherein said other of said sensors outputs
grayscale image information also.

15. An image scanning apparatus, comprising: 

a color sensor; 
a scan mechanism; 

a memory, for storage of image data represent- 
ing the image; and 

an analysis and control system, structured and 
arranged to detect grayscale portions of an im- 
age, based on data obtained as a result of said 
scan mechanism effecting a scan of the image 
by said color sensor, and to identify particular 
portions of the image as being of respective im- 
age-types, and for controlling said color sensor, 
said scan mechanism and said memory, 
wherein said analysis and control system caus- 
es said scan mechanism to effect a first scan 
of an image, in which first scan the image is 
scanned by said color sensor, after which, re- 
sponsive to a detection that the image contains 
grayscale image in at least one portion, said 
analysis and control system converts the color 
image data obtained for that portion by said 
color scanner to grayscale data, and 
wherein said analysis and control system causes
to be stored in said memory image data obtained
by said color sensor for non-grayscale
portions of the image and causes to be stored
in said memory the grayscale image data for
any portion of the image identified as being
grayscale image.

16. An image scanning apparatus, comprising:

a sensor;
a memory; and
a control system which, based on information
from said sensor, identifies portions of a document
as being of respective image types, and
which causes image data representing the document
to be stored in said memory,
wherein said control system organizes the image
data in a set of linked bit maps each containing
information of only one image type and
pertaining to only one of the identified portions
of the document.

17. An image scanning apparatus according to Claim
16, wherein said control system further organizes
the image data representing the document to include
in vector representation, information representing
any frames, horizontal straight lines and
vertical straight lines in the document.

18. An image scanning apparatus according to Claim
17, wherein said control system further organizes
the image data representing the document to exclude
from the bit maps information representing
background portions of the document.


19. An image scanning method comprising the steps of:

effecting a first scan of an image, in which first
scan the image is scanned only by a first sensor; and
effecting a second scan of the image, in which
second scan the image is scanned only by a
second sensor.

20. An image scanning method, comprising the steps
of:

effecting a first scan of an image, in which first
scan the image is scanned by only a first sensor;
detecting image-type of an image, based on data
obtained as a result of the first scan; and
effecting a second scan of the image, in which
second scan the image is scanned only by a
second sensor, responsive to a detection in
said detecting step that the image contains image
content of a particular type.

21. An image scanning method, comprising the steps
of:

effecting a first scan of an image, in which first
scan an image is scanned only by a first sensor;
detecting image-type of the image, based on
data obtained as a result of said effecting step,
and identifying particular portions of the image
as being of respective image-types;
effecting a second scan of the image, in which
second scan the image is scanned only by a
second sensor, responsive to a determination
in said detecting and identifying step that the
image contains image content of a particular
type in at least one portion;
storing in a memory image data obtained in the
first scan for all portions of the image identified;
and
storing in a memory image data obtained in the
second scan, only for portions of the image
identified as being the particular image-type.

22. An image scanning method, comprising the steps
of:

effecting a first scan of an image, in which first
scan the image is scanned only by a first sensor;
displaying image data obtained in the first scan;
effecting a second scan of the image responsive
to entry of a second-scan instruction by an
operator, in which second scan the image is
scanned by only a second sensor.

23. An image scanning method, comprising the steps
of:

a first sensor;
a second sensor;
a scan mechanism;
a display; and
effecting a scan of the image using only a first
sensor;
detecting image-type of an image, based on data
obtained in the first scan, and identifying particular
portions of the image as being of respective
image-types;
effecting a second scan of the image, in which
second scan the image is scanned using only
a second sensor, responsive to a determination
by said analysis and control system that the image
contains image content of a first and second
types in at least first and second respective
portions; and
displaying the image using image data obtained
in the first scan for portions of the image
identified as being of the first image-type, and
using image data obtained in the second scan,
only for portions of the image identified as being
the second image-type.

24. An image scanning method, comprising the steps 
of: 

effecting a first scan of an image, in which first 
scan the image is scanned using only a first 
sensor; 

detecting image-type of an image, based on da- 
ta obtained in the first scan, and identifying par- 
ticular portions of the image as being of respec- 
tive image-types; 

effecting a second scan of the image, in which 
second scan the image is scanned only using 
a second sensor, responsive to a determination 
in said detecting and identifying step that the 
image contains image content of a particular 
type in at least one portion; and 
initially storing in a memory image data ob- 
tained in the first scan for all portions of the im- 
age identified, and storing in the memory image 
data obtained in the second scan, only for por- 
tions of the image identified as being the par- 
ticular image-type, wherein the image data is 
stored in respective bit maps for respective por- 
tions of the image, the respective bit maps be- 
ing linked in the memory. 

25. An image scanning method according to any of 
Claims 19 to 24, further comprising the step of ef- 
fecting a third scan of the image, in which third scan 
the image is scanned using only a third sensor. 

26. An image scanning method according to any of 
Claims 19 to 24, wherein one of the scans produces
bi-level image information and the other of the 
scans produces color image information. 

27. An image scanning method according to Claim 26,
wherein said other of the scans produces grayscale 
image information also. 

28. An image scanning method, comprising the steps 
of: 

effecting a first scan of an image, in which first 
scan the image is scanned using a color sen- 
sor; 

detecting grayscale portions of an image, 
based on data obtained from the scan of the 
image using the color sensor, and identifying 
particular portions of the image as being of
respective image-types;

responsive to a detection that the image con- 
tains grayscale image in at least one portion, 
converting the color image data obtained for 
that portion by said color scanner to grayscale
data; and
storing in a memory image data obtained using
the color sensor for non-grayscale portions of
the image and storing in the memory the grayscale
image data for any portion of the image
identified as being grayscale image.

29. An image scanning method, comprising the steps
of:

identifying portions of a document as being of
respective image types, based on information
obtained by scanning the document; and
storing image data representing the document
in a memory,
wherein said storing step further comprises organizing
the image data for storage in a set of
linked bit maps each containing information of
only one image type and pertaining to only one
of the identified portions of the document.

30. An image scanning method according to Claim 29,
wherein, in said storing step, the image data representing
the document is further organized to include
in vector representation, information representing
any frames, horizontal straight lines and vertical
straight lines in the document.

31. An image scanning method according to Claim 30, 
wherein, in said storing step, the image data repre- 
senting the document is further organized to ex- 
clude from the bit maps information representing 
background portions of the document. 
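
The storage organization of Claims 29 to 31 can be sketched as a linked list of bit maps, one per identified portion, with frames and straight lines kept separately in vector form and background portions left out. This is only an illustration under assumptions: the class names `BitMap` and `StoredDocument`, the type labels, and the box layout (left, top, width, height) are hypothetical and are not taken from the claims.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # left, top, width, height (assumed layout)


@dataclass
class BitMap:
    image_type: str                  # one image type per bit map
    box: Box                         # the single portion this bit map covers
    data: bytes
    next: Optional["BitMap"] = None  # link to the next bit map in the set


@dataclass
class StoredDocument:
    head: Optional[BitMap] = None
    tail: Optional[BitMap] = None
    vectors: List[Tuple[str, Box]] = field(default_factory=list)

    def add_portion(self, image_type: str, box: Box, data: bytes = b"") -> None:
        if image_type == "background":
            return                                  # background is not stored
        if image_type in ("frame", "hline", "vline"):
            self.vectors.append((image_type, box))  # vector form, no bit map
            return
        node = BitMap(image_type, box, data)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node

    def bitmaps(self) -> Iterator[BitMap]:
        node = self.head
        while node is not None:
            yield node
            node = node.next


if __name__ == "__main__":
    doc = StoredDocument()
    doc.add_portion("text", (0, 0, 100, 20), b"\x01")
    doc.add_portion("background", (0, 20, 100, 10))
    doc.add_portion("hline", (0, 30, 100, 1))
    doc.add_portion("color", (0, 40, 50, 50), b"\x02")
    print([b.image_type for b in doc.bitmaps()], doc.vectors)
```

In this sketch a horizontal line costs one small tuple rather than a bit map, and a background portion costs nothing.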

32. A method or apparatus having the features of any 
combination of the preceding claims. 

Canon Wants Mutually Rewarding Coexistence 

Source: Fortune, 7/29/91 

Ryuzaburo Kaku, Chairman of the Board of Canon, Inc., in his 
recent interview had the following to say about the corporate 
world. The world is divided into four types of companies: 

(1) Purely capitalistic enterprises that exploit their workers 
for profit. 

(2) Those where management and labor work closely together 
to maximize profits, but don't pay enough attention to the 
community. 

(3) A company that both tries to make money but also seeks to 
fulfill its corporate responsibilities to society, but in a small 
scale way to a particular country or region. 

(4) A highly evolved type of company that contributes 
positively to world prosperity. 

Canon is aspiring to be the fourth type of company. This is a 
company that is socially responsible and practices good 
corporate citizenship at home and overseas and that can be 
referred to as a true global corporation. We have a basic 
philosophy to achieve a mutually rewarding coexistence among 
employees, shareholders, customers and the communities in 
which we do business. 




Canon Develops World's First 
Ferroelectric Liquid Crystal Display 

Source: Wall Street Journal, 10/2/91; 
Canon Press Release, 10/1/91 

In a news conference yesterday, Hiroshi Tanaka, a Canon senior 
managing director, said that the company has succeeded in 
developing the world's first ferroelectric liquid crystal 
(FLC) display screen. The screen will be test marketed next 
spring in Canon's EZPS Japanese language DTP system. 







Canon's Corporate Culture to Blend 
Best of U.S. & Japan 

Source: Fortune, 8/26/91 

Mr. Hideharu Takemoto, President of Canon U.S.A., was recently 
interviewed by Fortune Magazine and had the following to say 
about Canon in North America. 

Mr. Takemoto wants to create a new Canon corporate culture: 
"The best of American and Japanese cultures must be blended to 
produce a richer corporate alchemy, a new ideal." 

Canon wants to create more jobs for Americans in the 1990s and 
to make them an integral part of the Canon family. Mr. Takemoto 
is further committed to cultivating local talent and moving 
local executives up through the ranks of Canon's highest 
corporate echelons. 



Ink-Jet Printer Market Share 

Source: Computer Reseller News/InfoCorp 